Jointly Labeling Multiple Sequences: A Factorial HMM Approach

Author

  • Kevin Duh
Abstract

We present new statistical models for jointly labeling multiple sequences and apply them to the combined task of part-of-speech tagging and noun phrase chunking. The model is based on the Factorial Hidden Markov Model (FHMM) with distributed hidden states representing part-of-speech and noun phrase sequences. We demonstrate that this joint labeling approach, by enabling information sharing between tagging/chunking subtasks, outperforms the traditional method of tagging and chunking in succession. Further, we extend this into a novel model, Switching FHMM, to allow for explicit modeling of cross-sequence dependencies based on linguistic knowledge. We report tagging/chunking accuracies for varying dataset sizes and show that our approach is relatively robust to data sparsity.
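To make the idea of joint labeling concrete, here is a minimal sketch of exact decoding in a generic two-chain factorial HMM, done by running Viterbi over the product of the two state spaces. It is an illustration under stated assumptions, not the paper's implementation: the function name `joint_viterbi`, the independent per-chain transition tables, the joint emission table, and all toy probabilities are hypothetical, and the Switching FHMM extension is not modeled.

```python
import itertools
import math


def joint_viterbi(obs, states_a, states_b, init_a, init_b, trans_a, trans_b, emit):
    """Most likely joint state path for a two-chain factorial HMM.

    Assumption: each chain has its own transition table, and the observed
    word is emitted from the pair (state of chain A, state of chain B).
    """
    joint = list(itertools.product(states_a, states_b))

    def lg(p):
        # Log-probability that tolerates zeros.
        return math.log(p) if p > 0 else float("-inf")

    def step(prev, cur):
        # Log transition score between two joint states (chains move independently).
        return lg(trans_a[prev[0]][cur[0]]) + lg(trans_b[prev[1]][cur[1]])

    # delta[s]: best log-probability of any path ending in joint state s.
    delta = {
        s: lg(init_a[s[0]]) + lg(init_b[s[1]]) + lg(emit[s].get(obs[0], 0.0))
        for s in joint
    }
    backpointers = []
    for word in obs[1:]:
        new_delta, pointers = {}, {}
        for cur in joint:
            best_prev = max(joint, key=lambda prev: delta[prev] + step(prev, cur))
            new_delta[cur] = (
                delta[best_prev] + step(best_prev, cur) + lg(emit[cur].get(word, 0.0))
            )
            pointers[cur] = best_prev
        delta = new_delta
        backpointers.append(pointers)

    # Follow back-pointers from the best final joint state.
    path = [max(joint, key=lambda s: delta[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy, made-up parameters: chain A holds POS tags, chain B holds NP-chunk tags.
POS = ["DT", "NN"]
CHUNK = ["B", "I"]
INIT_POS = {"DT": 0.8, "NN": 0.2}
INIT_CHUNK = {"B": 0.9, "I": 0.1}
TRANS_POS = {"DT": {"DT": 0.1, "NN": 0.9}, "NN": {"DT": 0.5, "NN": 0.5}}
TRANS_CHUNK = {"B": {"B": 0.2, "I": 0.8}, "I": {"B": 0.5, "I": 0.5}}
EMIT = {
    ("DT", "B"): {"the": 0.9, "dog": 0.1},
    ("DT", "I"): {"the": 0.6, "dog": 0.4},
    ("NN", "B"): {"the": 0.1, "dog": 0.9},
    ("NN", "I"): {"the": 0.1, "dog": 0.9},
}

result = joint_viterbi(
    ["the", "dog"], POS, CHUNK, INIT_POS, INIT_CHUNK, TRANS_POS, TRANS_CHUNK, EMIT
)
print(result)
```

With these toy tables the decoder labels "the dog" as `[("DT", "B"), ("NN", "I")]`, choosing the tag and the chunk label for each word in one pass rather than tagging first and chunking afterwards. The product state space grows with the number of tags times the number of chunk labels, so this exhaustive search is only practical for small label sets.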

Similar resources

Joint Labeling of Multiple Sequences: A Factorial HMM Approach

Various sequence-labeling tasks in natural language processing require the cascading of error-prone subtasks. For instance, in syntactic analysis of sentences, part-of-speech (POS) tagging results are often used for noun-phrase segmentation (NP chunking), even though initial errors may hurt downstream processing. To mitigate this problem, I use a factorial hidden Markov Model (FHMM), where the ...

Supertagging with Factorial Hidden Markov Models

Factorial Hidden Markov Models (FHMM) support joint inference for multiple sequence prediction tasks. Here, we use them to jointly predict part-of-speech tag and supertag sequences with varying levels of supervision. We show that supervised training of FHMM models improves performance compared to standard HMMs, especially when labeled training data is scarce. Secondly, we show that an FHMM and ...

Introducing Deep Modular Neural Networks with a Double Spatio-Temporal Structure to Improve Persian Continuous Speech Recognition

In this article, growable deep modular neural networks for continuous speech recognition are introduced. These networks can be grown to implement the spatio-temporal information of the frame sequences at their input layer as well as their labels at the output layer at the same time. The trained neural network with such double spatio-temporal association structure can learn the phonetic sequence...

An excitation model for HMM-based speech synthesis based on residual modeling

This paper describes a trainable excitation approach to eliminate the unnaturalness of HMM-based speech synthesizers. During the waveform generation part, mixed excitation is constructed by state-dependent filtering of pulse trains and white noise sequences. In the training part, filters and pulse trains are jointly optimized through a procedure which resembles analysis-by-synthesis speech codin...

Transformation streams and the HMM error model

The most popular model used in automatic speech recognition is the hidden Markov model (HMM). Though good performance has been obtained with such models, there are well-known limitations in their ability to model speech. A variety of modifications to the standard HMM topology have been proposed to handle these problems. One approach is the factorial HMM. This paper introduces a new form of factori...

Publication date: 2005